# README for "Replication data for: An Analysis of Data Availability Statements in Qualitative Research Journal Articles"
Derek J. Robey, Dessi Kirilova, Sebastian Karcher, and Nic Weber

## Summary
This data deposit includes data and code to assemble the dataset, generate all figures and values used in the paper and appendix, and generate the codebook. It also includes the codebook and the figures. The analysis.R script and the data in data/analysis are sufficient to reproduce all findings in the paper. The additional scripts and the data files in data/raw are included for full transparency and to facilitate the detection of any errors in the data processing pipeline. Their structure reflects how the project developed over time.

## Running the code
Open the R project file qualitative-data-availability.Rproj in RStudio. All files in the 'code' folder can be run independently and in any order.
Note that knitting codebook.Rmd writes the codebook to the code folder. The copy in this deposit has been moved to the root for ease of access.
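
For users working outside RStudio, the following is a minimal sketch of running the main analysis from an R console. It assumes that the working directory is the root of the deposit and that the scripts use paths relative to that root, which is what opening the .Rproj file would set up. The variable names are illustrative only.

```r
# Sketch only: run from the root of the deposit (the folder with the .Rproj file).
# Paths follow the deposit layout described in this README.
analysis_script <- file.path("code", "analysis.R")
analysis_data   <- file.path("data", "analysis", "plos_cleaned_data.rds")

if (file.exists(analysis_script) && file.exists(analysis_data)) {
  # Assumption: analysis.R resolves its inputs relative to the deposit root
  source(analysis_script, echo = FALSE)
} else {
  message(
    "Expected files not found. Is the working directory the deposit root? ",
    "Current directory: ", getwd()
  )
}
```

If the files are not found, `setwd()` to the deposit root (or open the .Rproj file) and try again.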

## Deposit contents

- codebook.html - the codebook for the cleaned, full dataset as an HTML file (will open in any browser)
- README_qualitative-data-availability.md - this file

/code
- analysis.R - reproduces all figures and table values in the paper and appendix
- check-supplements.R - script to check for the existence of supplementary files for PLOS articles that indicate that data are in the paper or supplementary materials. Generates `added_supplements.rds`, which is merged with the full dataset during cleaning.
- cleaning_recoding.R - cleans the dataset and recodes/generates variables as needed. Generates `/data/analysis/plos_cleaned_data.csv` and `/data/analysis/plos_cleaned_data.rds`.
- codebook.Rmd - R markdown file to generate a codebook for plos_cleaned_data.rds using the codebook package
- retrieve-metadata.R - script to retrieve metadata for included articles from PLOS to fix text encoding issues that were introduced during manual data coding

/data
- /analysis
    - plos_cleaned_data.rds - the main analysis dataset, fully cleaned and documented in codebook.html
    - plos_cleaned_data.csv - a CSV version of the analysis dataset for interoperability and preservation; not used in the analysis
- /raw
    - added_supplements.rds - generated with the `check-supplements.R` script, as described above
    - citation_counts.csv - citation counts from OpenAlex for all included articles. Code used to obtain the citation counts is in `cleaning_recoding.R`, commented out. The script uses this file to merge in the citation counts.
    - plos_fully_coded.csv - the raw, coded data; exported as CSV from the Google Sheet where manual coding was performed.
    - plos_metadata.csv - metadata retrieved using the `retrieve-metadata.R` script to fix text encoding issues and augment the original data.
    - plos_qualitative_studies.csv - basic metadata of studies used for the `retrieve-metadata.R` script.

/figures
- contains all figures generated in the code; running `analysis.R` regenerates the figures.
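
To inspect the main analysis dataset without running any of the scripts, something like the following sketch should work. It assumes the working directory is the root of the deposit; the object name `plos` is illustrative and does not come from the deposit's code.

```r
# Sketch only: load the cleaned analysis dataset from the deposit root.
rds_path <- file.path("data", "analysis", "plos_cleaned_data.rds")

if (file.exists(rds_path)) {
  plos <- readRDS(rds_path)        # base R; no extra packages required to read
  print(dim(plos))                 # number of rows and columns
  str(plos, list.len = 10)         # structure of the first ten variables
} else {
  message("File not found: ", rds_path, " (working directory: ", getwd(), ")")
}
```

Variable definitions are documented in codebook.html.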


## R environment
```
> sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 26100)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                           LC_TIME=English_United States.utf8    

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] haven_2.5.4       labelled_2.12.0   codebook_0.9.2    lubridate_1.9.2   ggplot2_3.5.1     tibble_3.2.1      httr_1.4.5       
 [8] xml2_1.3.3        purrr_1.0.1       tidyr_1.3.0       readr_2.1.4       countrycode_1.6.0 stringr_1.5.0     dplyr_1.1.2      

loaded via a namespace (and not attached):
 [1] utf8_1.2.3         generics_0.1.3     stringi_1.7.12     digest_0.6.31      hms_1.1.3          magrittr_2.0.3     evaluate_0.20     
 [8] grid_4.3.1         timechange_0.2.0   RColorBrewer_1.1-3 fastmap_1.1.1      fansi_1.0.4        scales_1.3.0       textshaping_0.3.6 
[15] cli_3.6.1          rlang_1.1.4        crayon_1.5.2       bit64_4.0.5        munsell_0.5.0      yaml_2.3.7         withr_2.5.0       
[22] tools_4.3.1        parallel_4.3.1     tzdb_0.3.0         colorspace_2.1-0   forcats_1.0.0      vctrs_0.6.2        R6_2.5.1          
[29] ggridges_0.5.4     lifecycle_1.0.3    bit_4.0.5          vroom_1.6.1        ragg_1.2.5         pkgconfig_2.0.3    archive_1.1.10    
[36] pillar_1.9.0       gtable_0.3.3       glue_1.6.2         systemfonts_1.1.0  xfun_0.48          tidyselect_1.2.0   knitr_1.42        
[43] rstudioapi_0.14    farver_2.1.1       htmltools_0.5.5    rmarkdown_2.21     labeling_0.4.2     compiler_4.3.1 
```
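
Results may differ slightly under other package versions. As a convenience, the sketch below compares locally installed versions against the attached packages recorded above. The version numbers are copied from the `sessionInfo()` output; the helper `installed_version()` and the object names are illustrative and not part of the deposit.

```r
# Versions copied from the "other attached packages" list above
recorded <- c(
  haven = "2.5.4", labelled = "2.12.0", codebook = "0.9.2",
  lubridate = "1.9.2", ggplot2 = "3.5.1", tibble = "3.2.1",
  httr = "1.4.5", xml2 = "1.3.3", purrr = "1.0.1", tidyr = "1.3.0",
  readr = "2.1.4", countrycode = "1.6.0", stringr = "1.5.0",
  dplyr = "1.1.2"
)

# Hypothetical helper: installed version of a package, or NA if not installed
installed_version <- function(pkg) {
  tryCatch(
    as.character(utils::packageVersion(pkg)),
    error = function(e) NA_character_
  )
}

comparison <- data.frame(
  package   = names(recorded),
  recorded  = unname(recorded),
  installed = vapply(names(recorded), installed_version, character(1),
                     USE.NAMES = FALSE),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# TRUE only when the package is installed and the version strings are identical
comparison$matches <- !is.na(comparison$installed) &
  comparison$installed == comparison$recorded

print(comparison)
```

A mismatch does not necessarily mean the code will fail; it only flags where the local environment departs from the one recorded here.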